
20 research outputs found

Distributed Signal Processing Algorithms for Wireless Networks

Author: Xu Songcen
Publication venue: University of York
Publication date: 01/07/2015

Distributed signal processing algorithms have become a key approach for statistical inference in wireless networks and applications such as wireless sensor networks and smart grids. It is well known that distributed processing techniques deal with the extraction of information from data collected at nodes that are distributed over a geographic area. In this context, for each specific node, a set of neighbor nodes collect their local information and transmit the estimates to a specific node. Then, each specific node combines the collected information together with its local estimate to generate an improved estimate. In this thesis, novel distributed cooperative algorithms for inference in ad hoc, wireless sensor networks and smart grids are investigated. Low-complexity and effective algorithms to perform statistical inference in a distributed way are devised. A number of innovative approaches for dealing with node failures, compression of data and exchange of information are proposed and summarized as follows: Firstly, distributed adaptive algorithms based on the conjugate gradient (CG) method for distributed networks are presented. Both incremental and diffusion adaptive solutions are considered. Secondly, adaptive link selection algorithms for distributed estimation and their application to wireless sensor networks and smart grids are proposed. Thirdly, a novel distributed compressed estimation scheme is introduced for sparse signals and systems based on compressive sensing techniques. The proposed scheme consists of compression and decompression modules inspired by compressive sensing to perform distributed compressed estimation. A design procedure is also presented and an algorithm is developed to optimize measurement matrices. Lastly, a novel distributed reduced-rank scheme and adaptive algorithms are proposed for distributed estimation in wireless sensor networks and smart grids. 
The proposed distributed scheme is based on a transformation that performs dimensionality reduction at each agent of the network followed by a reduced-dimension parameter vector

White Rose E-theses Online
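
As a rough illustration of the neighbor-combination idea in the thesis abstract above, the following Python sketch runs an adapt-then-combine diffusion strategy with a plain LMS update and uniform averaging. It is not the conjugate-gradient algorithm of the thesis; the function name `diffusion_lms`, the ring topology, the step size and the noise level are all assumptions chosen for the example.

```python
import random


def diffusion_lms(neighbors, w_true, mu=0.05, steps=500, noise=0.01, seed=0):
    """Adapt-then-combine diffusion LMS on a fixed network (illustrative only).

    neighbors maps each node index 0..N-1 to a list of neighbor indices.
    w_true is the unknown parameter vector every node tries to estimate.
    """
    rng = random.Random(seed)
    n_nodes, dim = len(neighbors), len(w_true)
    w = [[0.0] * dim for _ in range(n_nodes)]
    for _ in range(steps):
        # Adapt: each node takes one LMS step on its own noisy measurement.
        psi = []
        for k in range(n_nodes):
            x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            d = sum(xi * wi for xi, wi in zip(x, w_true)) + rng.gauss(0.0, noise)
            err = d - sum(xi * wi for xi, wi in zip(x, w[k]))
            psi.append([wi + mu * err * xi for wi, xi in zip(w[k], x)])
        # Combine: each node averages its own intermediate estimate
        # with those received from its neighbors (uniform weights, assumed).
        for k in range(n_nodes):
            group = [k] + list(neighbors[k])
            w[k] = [sum(psi[j][i] for j in group) / len(group) for i in range(dim)]
    return w


# Hypothetical 4-node ring network and target vector.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
target = [0.5, -1.0, 2.0]
estimates = diffusion_lms(ring, target)
```

Each entry of `estimates` is one node's final estimate; with these settings all nodes should end up close to `target`. Uniform averaging is the simplest combination rule and is used here only to keep the sketch short.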

Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network

Authors: Liu Jianzhuang, Liu Shuaicheng, Liu Zhen, Wang Yinglong, Xu Songcen
Publication date: 16/08/2023

This paper presents a novel network structure with illumination-aware gamma correction and complete image modelling to solve the low-light image enhancement problem. Low-light environments usually lead to large-scale dark areas that carry little information, so directly learning deep representations from low-light images is insensitive to recovering normal illumination. We propose to integrate the effectiveness of gamma correction with the strong modelling capacities of deep networks, which enables the correction factor gamma to be learned in a coarse-to-elaborate manner by adaptively perceiving the deviated illumination. Because the exponential operation introduces high computational complexity, we propose to use a Taylor series to approximate gamma correction, accelerating training and inference. Dark areas usually occupy large regions of low-light images, so common local modelling structures, e.g., CNN and SwinIR, are insufficient to recover accurate illumination across whole low-light images. We propose a novel Transformer block to completely simulate the dependencies of all pixels across images via a local-to-global hierarchical attention mechanism, so that dark areas can be inferred by borrowing information from distant informative regions in a highly effective manner. Extensive experiments on several benchmark datasets demonstrate that our approach outperforms state-of-the-art methods.
Comment: Accepted by ICCV 202

arXiv.org e-Print Archive
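
The Taylor-series approximation of gamma correction mentioned in the abstract above can be sketched as follows. This is a minimal illustration that assumes the expansion x^γ = exp(γ ln x) ≈ Σ (γ ln x)^k / k!; the exact expansion used in the paper is not given in the abstract, and the function name `gamma_correct_taylor` and the default number of terms are hypothetical.

```python
import math


def gamma_correct_taylor(x, gamma, terms=12):
    """Approximate x**gamma for a pixel intensity x in (0, 1].

    Uses a truncated Taylor series of exp(t) at t = gamma * ln(x).
    The number of terms is an assumed default, not a value from the paper.
    """
    if x <= 0.0:
        return 0.0  # treat non-positive intensities as black
    t = gamma * math.log(x)
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= t / k  # term now equals t**k / k!
        total += term
    return total


# Brighten a mid-dark pixel with gamma < 1 and compare with the exact power.
approx = gamma_correct_taylor(0.5, 0.45)
exact = 0.5 ** 0.45
```

The truncation error grows as `gamma * ln(x)` moves away from zero, so very dark pixels need more terms than mid-tones for the same accuracy.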

Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images

Authors: Deng Jiankang, Guo Yuanfan, Han Jianhua, Li Ying, Xu Hang, Xu Songcen, Zheng Qingping
Publication date: 31/08/2023

Stable diffusion, a generative model used in text-to-image synthesis, frequently encounters resolution-induced composition problems when generating images of varying sizes. This issue primarily stems from the model being trained on pairs of single-scale images and their corresponding text descriptions. Moreover, direct training on images of unlimited sizes is unfeasible, as it would require an immense number of text-image pairs and entail substantial computational expenses. To overcome these challenges, we propose a two-stage pipeline named Any-Size-Diffusion (ASD), designed to efficiently generate well-composed images of any size, while minimizing the need for high-memory GPU resources. Specifically, the initial stage, dubbed Any Ratio Adaptability Diffusion (ARAD), leverages a selected set of images with a restricted range of ratios to optimize the text-conditional diffusion model, thereby improving its ability to adjust composition to accommodate diverse image sizes. To support the creation of images at any desired size, we further introduce a technique called Fast Seamless Tiled Diffusion (FSTD) at the subsequent stage. This method allows for the rapid enlargement of the ASD output to any high-resolution size, avoiding seaming artifacts or memory overloads. Experimental results on the LAION-COCO and MM-CelebA-HQ benchmarks demonstrate that ASD can produce well-structured images of arbitrary sizes, cutting down the inference time by 2x compared to the traditional tiled algorithm

arXiv.org e-Print Archive
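
For context on the tiled approach that the abstract above compares against, the sketch below computes a generic overlapping-tile layout for a large image. It is not the FSTD method itself, whose details are not given in the abstract; the function name `tile_boxes`, the tile size and the overlap are assumptions for illustration.

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Return (left, top, right, bottom) boxes covering the image.

    Neighboring tiles share at least `overlap` pixels so their results can
    be blended; the last tile in each direction sits flush with the border.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    stride = tile - overlap

    def starts(size):
        # A single tile suffices when the image is no larger than the tile.
        if size <= tile:
            return [0]
        pos = list(range(0, size - tile, stride))
        pos.append(size - tile)
        return pos

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


# Hypothetical 1024x512 target image processed with 512-pixel tiles.
boxes = tile_boxes(1024, 512)
```

Each box would be processed independently and the overlapping strips blended afterwards, which keeps peak memory tied to the tile size rather than the full image size.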

Fuse Your Latents: Video Editing with Multi-source Latent Diffusion Models

Authors: Gu Jiaxi, Lu Tianyi, Pei Renjing, Wu Zuxuan, Xu Hang, Xu Songcen, Zhang Xing
Publication date: 25/10/2023

Latent Diffusion Models (LDMs) are renowned for their powerful capabilities in image and video synthesis. Yet, video editing methods suffer from insufficient pre-training data or video-by-video re-training cost. To address this gap, we propose FLDM (Fused Latent Diffusion Model), a training-free framework that achieves text-guided video editing by applying off-the-shelf image editing methods in video LDMs. Specifically, FLDM fuses latents from an image LDM and a video LDM during the denoising process. In this way, temporal consistency can be kept with the video LDM while the high fidelity of the image LDM can also be exploited. Meanwhile, FLDM is highly flexible, since both the image LDM and the video LDM can be replaced, so advanced image editing methods such as InstructPix2Pix and ControlNet can be exploited. To the best of our knowledge, FLDM is the first method to adapt off-the-shelf image editing methods into video LDMs for video editing. Extensive quantitative and qualitative experiments demonstrate that FLDM can improve the textual alignment and temporal consistency of edited videos

arXiv.org e-Print Archive
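
The latent fusion described in the abstract above could look roughly like the sketch below, which blends two latents of the same shape with a convex weight at a single denoising step. The linear blend, the function name `fuse_latents` and the weight `alpha` are assumptions; the abstract does not state the fusion rule FLDM actually uses.

```python
def fuse_latents(video_latent, image_latent, alpha):
    """Blend two flattened latents element-wise (illustrative only).

    alpha = 1.0 keeps only the video-LDM latent and alpha = 0.0 keeps only
    the image-LDM latent; values in between mix the two.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(video_latent) != len(image_latent):
        raise ValueError("latents must have the same length")
    return [
        alpha * v + (1.0 - alpha) * i
        for v, i in zip(video_latent, image_latent)
    ]


# Toy latents standing in for the outputs of the two models at one step.
fused = fuse_latents([1.0, 2.0], [3.0, 4.0], alpha=0.5)
```

In a full pipeline this blend would be applied inside the denoising loop, possibly with a weight that changes from step to step; that schedule is left out here because the abstract does not describe it.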

Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Authors: Li Huibin, Liang Xiaodan, Lu Guansong, Sun Jian, Xu Hang, Xu Songcen, Xu Zongben, Yu Cuican, Zeng Yihan, Zhang Wei
Publication date: 31/08/2023

Generating 3D faces from textual descriptions has a multitude of applications, such as gaming, movies, and robotics. Recent progress has demonstrated the success of unconditional 3D face generation and text-to-3D shape generation. However, due to the limited number of text-3D face data pairs, text-driven 3D face generation remains an open problem. In this paper, we propose a text-guided 3D face generation method, referred to as TG-3DFace, for generating realistic 3D faces using text guidance. Specifically, we adopt an unconditional 3D face generation framework and equip it with text conditions, so that it learns text-guided 3D face generation with only text-2D face data. On top of that, we propose two text-to-face cross-modal alignment techniques, global contrastive learning and a fine-grained alignment module, to facilitate high semantic consistency between generated 3D faces and input texts. In addition, we present directional classifier guidance during the inference process, which encourages creativity for out-of-domain generations. Compared to existing methods, TG-3DFace creates more realistic and aesthetically pleasing 3D faces, improving multi-view consistency (MVIC) by 9% over Latent3D. The rendered face images generated by TG-3DFace achieve higher FID and CLIP scores than text-to-2D face/image generation models, demonstrating our superiority in generating realistic and semantically consistent textures.
Comment: accepted by ICCV 202

arXiv.org e-Print Archive